Probabilistic correlation-based similarity measure on text records

Authors

  • Shaoxu Song
  • Han Zhu
  • Lei Chen
Abstract

Article history: Received 29 July 2013; received in revised form 12 July 2014; accepted 3 August 2014; available online 20 August 2014.

Similar resources

Document clustering based on ontology and a fuzzy approach

Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering is one of the important techniques of text mining; it is the unsupervised classification of similar documents into different groups. The most important step...

A new vector valued similarity measure for intuitionistic fuzzy sets based on OWA operators

Much research has been carried out on measures of distance, similarity, and correlation between intuitionistic fuzzy sets (IFSs). However, most of them are single-valued measures and lack potential for efficiency validation. In this paper, a new vector-valued similarity measure for IFSs is proposed based on OWA operators. The vector is defined as a two-tuple consisting of ...

Privacy Preserving Probabilistic Record Linkage Using Locality Sensitive Hashes

As part of increased efforts to provide precision medicine to patients, large clinical research networks (CRNs) are building regional and national collections of electronic health records (EHRs) and patient-reported outcomes (PROs). To protect patient privacy, each data contributor to the CRN (for example, a health-care provider) uses anonymizing and encryption technology before publishing the d...

An Efficient Document Clustering Based on HUBNESS Proportional K-Means Algorithm

Evaluating similarity between documents is a main operation in the text processing field. Similarity measurement is used to estimate the relationship between records or documents. In the existing system, similarity between two documents can be computed with respect to a feature by using the Similarity Measure for Text Processing (SMTP). In the proposed system, a hybrid SMTP scheme is integrated with hubness base...

Probabilistic Measure of Colour Image Processing Fidelity

In this paper, a probabilistic approach to quality assessment of image processing algorithms is proposed. The presented scalar measure can be used for any colour space and gives very similar results regardless of the image content. It can be an interesting supplement to existing image quality metrics in applications where the details of the processing algorithm are known. Its good correlation with su...

Journal:
  • Inf. Sci.

Volume 289

Publication date: 2014